Learning Meaning without Primitives: Typology Predicts Developmental Patterns
Authors
Abstract
Does the cognitive naturalness of concepts affect the acquisitional path of meaning? In this paper, we explore the use of crosslinguistically elicited data to approximate cognitive naturalness, following Gentner and Bowerman’s (2009) Typological Prevalence Hypothesis. Using the domain of topological spatial relations as a case study, we show how this kind of data allows us to simulate developmental patterns of order of acquisition and overgeneralization in Dutch. This result suggests that the Typological Prevalence Hypothesis can be computationally operationalized and evaluated, that modeling semantic acquisition without hand-coded semantic primitives is possible, and finally, that crosslinguistic data provides a good source of information to do so.

The acquisition of meaning

How does a child acquire the language-specific ways of conceptualizing the world that are necessary to form the meaning constructs of her language? Does non-linguistic maturation of the conceptual system play a central role, or is it language that is in the driver’s seat? Gentner and Bowerman (2009) (henceforth: GB) argue that both are true to some extent: language plays a role in constructing subspaces of the conceptual space that constitute the meanings of the language a child learns (Bowerman & Choi, 2001), but the conceptual system is not a blank slate when language learning commences. Some dimensions of categorization come more naturally to the child, being perhaps cognitively more basic, whereas others are harder for the child to grasp (Casasola & Cohen, 2002). Bridging these two insights is GB’s Typological Prevalence Hypothesis (TPH):[1]

All else being equal, within a given domain, the more frequently a given way of categorizing is found in the languages of the world, the more natural it is for human cognizers, hence the easier it will be for children to learn. (Gentner & Bowerman, 2009, 467)

[1] Bowerman (1993) first proposed the TPH, but not by that name. See Jakobson (1971) for a congenial idea for phonology.

In this paper, we present a computational model that acquires the meaning of linguistic expressions using crosslinguistically elicited data. The goals of this endeavor are twofold: first, to operationalize the TPH, and second, to explore a novel method for studying meaning without hand-coded semantic primitives. In using only crosslinguistic distributional patterns, the approach is similar to distributional semantic approaches to meaning, which use the textual distribution of linguistic items in a corpus to approximate their meaning, often from a cognitive perspective (e.g. Mitchell & Lapata, 2010; Louwerse, 2011). Rather than using corpus data from a single language, however, we use crosslinguistically elicited data for a fixed set of stimulus situations.

Developing methods that circumvent the use of manual features is desirable: although we agree with Xu and Kemp (2010) that discovering crosslinguistically valid semantic primitives is indispensable, this approach also has a fundamental methodological problem. The selection of a finite set of discrete primitives is bound to reflect a coder’s bias. As long as we are not sure how the coder’s culturally informed conceptual grid influences coding practices, there is no independent ground truth of what the correct primitives are.
Using the same set of stimuli for all informants and letting the elicited data speak for itself largely obviates this problem.[2]

[2] Any bias is then moved to the construction of a set of stimuli for which linguistic material is elicited (cf. Lucy, 1997), which is less severe, as the categorization of situations is left to the informants.

Using crosslinguistic data furthermore provides a novel basis for computational approaches to meaning, in particular the acquisition of form-meaning pairings. Most acquisitional modeling studies pair utterances with primitives that are derived from the language itself (Fazly, Alishahi, & Stevenson, 2010), taken from resources like WordNet (Alishahi, Fazly, Koehne, & Crocker, 2012), or hand-coded on the basis of video data (Fleischman & Roy, 2005; Beekhuizen, Fazly, Nematzadeh, & Stevenson, 2013). Given the crosslinguistic variation noted in papers like Brown (1994) and Bowerman and Choi (2001), all of these methods are very prone to reflect the subdivisions of the conceptual space that English makes. The learner’s subdivision of the conceptual hypothesis space is thus already fixed and set to the target language in these approaches. In this study, we show how this can be avoided and how more language-neutral primitives can be selected.

We apply our model to the domain of topological spatial relations and simulate GB’s findings about the acquisition of prepositions in Dutch: namely, that the crosslinguistically common grouping constituting the meaning of the preposition op is acquired earlier than, and overgeneralized to, two prepositions reflecting crosslinguistically less common groupings, aan and om. We show how the model is able, first, to learn the extensional meaning of Dutch prepositions reasonably well, and, most importantly, that it simulates the order of acquisition as well as the developmental pattern of overgeneralization GB observed. Finally, we show that this effect is not merely due to the frequencies of the prepositions.

Typological data as a proxy for meaning

The use of crosslinguistic data to explore the conceptual space onto which languages map their linguistic forms, which has its roots in work on color vocabulary (Berlin & Kay, 1969), is a recent technique that has been applied to several semantic phenomena, such as expressions of cutting and breaking events (Majid, Boster, & Bowerman, 2008) and markers of topological space (Levinson, Meira, & The Language and Cognition Group, 2003; Regier, Khetarpal, & Majid, 2013), the phenomenon studied here. In these approaches, informants speaking different languages are asked to describe a fixed set of stimuli. Mathematical techniques for extracting latent information are then applied to these data in order to explore the main loci of variation; such techniques reveal structure in the data that would be hard to find ‘by hand’. We use similar techniques, but go beyond their exploratory use by employing the extracted dimensions directly as a conceptual space within which a learner constructs the semantic categories of her language.

We use the dataset of Levinson et al. (2003), which consists of elicited markers of topological relations for a set of 71 pictures of such relations developed by Bowerman and Pederson (1992). The elicitation was done in 9 genetically unrelated languages,[3] with a varying number of participants (between 1 and 26) per language, where each participant was asked to label each situation in his or her own language.

[3] Basque, Dutch, Ewe, Lao, Lavukaleve, Tiriyó, Trumai, Yélı̂ Dnye, Yukatek.
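To make the structure of such an elicitation dataset concrete, the following is a minimal sketch in Python (assuming pandas is available) of how per-informant responses could be tabulated into the situation-by-(language, marker) count matrix described next. The response records shown are hypothetical stand-ins for illustration only, not the actual Levinson et al. (2003) data.

    # A minimal sketch (not the authors' code): tabulating hypothetical
    # per-informant elicitation records into a situation-by-(language, marker)
    # count matrix of the kind described below.
    from collections import Counter

    import pandas as pd

    # One record per response: (situation id 1..71, language, marker used by one
    # informant). Secondary responses simply appear as additional records.
    responses = [
        (1, "Dutch", "op"),
        (1, "Dutch", "op"),
        (1, "Basque", "-an"),
        (2, "Dutch", "aan"),
        (2, "Basque", "-an"),
        # ... in the real data: all responses of all informants in all 9 languages
    ]

    counts = Counter((sit, (lang, marker)) for sit, lang, marker in responses)
    situations = sorted({sit for sit, _ in counts})
    columns = sorted({col for _, col in counts})

    # Rows: situations; columns: (language, marker) pairs; cells: how many
    # informants of that language used that marker for that situation.
    matrix = pd.DataFrame(
        [[counts.get((sit, col), 0) for col in columns] for sit in situations],
        index=situations,
        columns=pd.MultiIndex.from_tuples(columns, names=["language", "marker"]),
    )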
This dataset constitutes a matrix in which the rows are the 71 situations and each of the 120 columns is a pair of a language and a marker in that language. The cells are filled with the counts of how many informants used that marker for that situation. We used all responses to the stimuli, including secondary ones, so as to capture as much within-language variation as possible. As this matrix contains crosslinguistic counts of situation-adposition mappings, the similarity between pairs of situations (the rows) reflects how often those situations are described with the same adposition across languages. Following the TPH, this information is thought to reflect the cognitive naturalness of the groupings.

We use Principal Component Analysis (PCA; Hotelling, 1933) to extract underlying dimensions (components) from the data matrix, where each component represents a combination of the dimensions of the original matrix such that crosslinguistic patterns of similarity in the classification of situations will surface.[4] PCA iteratively extracts a vector of coordinates defining a line in the high-dimensional space of the data matrix (an eigenvector or component) that has the largest variance, or eigenvalue, given all previously extracted components, until all variance is covered. Applying PCA, we extract 70 components, where the first component accounts for 24% of the variance in the 120 dimensions of the original data matrix, the second for 19%, and so on. What does the PCA-transformed space look like? We focus

[4] Unlike Levinson et al. (2003), who use dimension-reduction techniques to explore the conceptual space, we do not normalize or filter the data by using only modal responses, because by doing so, we lose information about the within-language variation. Also, by minimizing the number of operations on the data, we minimize possible alternative explanations of the results.

Figure 1: Values on components 1 and 3 for situations with modal responses in, aan, op and om. Numbers correspond to the numbering of the situations in Levinson et al. (2003).
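As an illustration of this step only, here is a sketch using scikit-learn’s standard centered PCA over a random placeholder matrix with the same shape as the real count matrix (71 situations by 120 language-marker columns). The placeholder counts and the choice of scikit-learn are assumptions made for the sketch; they are not the study’s actual data or code.

    # Illustrative sketch of the PCA step (scikit-learn's centered PCA is an
    # assumption; the random counts below merely stand in for the real
    # 71 x 120 situation-by-(language, marker) matrix).
    import numpy as np
    from sklearn.decomposition import PCA

    rng = np.random.default_rng(0)
    X = rng.poisson(0.5, size=(71, 120)).astype(float)  # placeholder count matrix

    pca = PCA(n_components=70)     # at most 70 informative components after centering
    scores = pca.fit_transform(X)  # coordinates of the 71 situations in component space

    # Proportion of variance accounted for by each component; on the real data
    # the paper reports about 24% for component 1 and 19% for component 2.
    print(np.round(pca.explained_variance_ratio_[:5], 2))

    # Components 1 and 3 (0-indexed: 0 and 2) are the two dimensions plotted in
    # Figure 1 for situations whose modal Dutch response is in, op, aan or om.
    comp1, comp3 = scores[:, 0], scores[:, 2]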